The impact of IDH and NAT2 gene polymorphisms in acute myeloid leukemia risk and overall survival in an Arab population: A case-control study

Acute myeloid leukemia (AML) is a malignancy of the myeloid cells caused by the clonal, malignant proliferation of blast cells. The etiology of AML is complex and involves environmental and genetic factors. Implicated genes include FLT3, DNMT3, IDH1, IDH2, NAT2, and WT1. In this study, we analyzed the relationship between five single nucleotide polymorphisms (SNPs), none previously studied in any Arab population, and the risk and overall survival of AML in Jordanian patients. The SNPs are NAT2 (rs1799930 and rs1799931), IDH1 (rs121913500), and IDH2 (rs121913502 and rs1057519736). A total of 30 AML patients and 225 healthy controls were included in this study. Females comprised 50% (n = 15) of patients and 65.3% (n = 147) of controls. Genomic DNA was extracted from formalin-fixed paraffin-embedded tissues for the AML patients (case group) and from peripheral blood samples for the control subjects. Genotyping of the genetic polymorphisms was conducted using a sequencing protocol. Our study indicates that the NAT2 rs1799930 SNP showed a statistically significant difference in genotype frequency between cases and controls (p = 0.023), while IDH mutations did not correlate with the risk or survival of AML in the Jordanian population. These results were similar in the TCGA-LAML cohort, with the notable exception of the rare NAT2 mutation. A larger cohort study is needed to further investigate our results.


Introduction
Acute myeloid leukemia (AML) is a malignancy of the myeloid cells caused by the clonal, malignant proliferation of blast cells [1]. AML is the most prevalent type of acute leukemia in adults, with a median age of 68 years. According to the GLOBOCAN report, the 5-year prevalence of leukemia in Jordan was 15.26 per 100,000 in 2020 [2]. The etiology of AML is complex and involves environmental and genetic factors [3]. Environmental factors include smoking, obesity, and previous exposure to chemotherapy or radiation; implicated genes include FLT3, DNMT3, IDH1, IDH2, NAT2, and WT1 [4,5]. Research into somatic mutations and single-nucleotide polymorphisms (SNPs) has therefore helped to clarify the mechanisms driving leukemogenesis and treatment response, and to improve patients' survival [6]. N-acetyltransferase 2 (NAT2) is a gene located on the short arm of chromosome 8 that plays a key role in metabolizing carcinogens such as aromatic amines, arylamines and hydrazines through acetylation reactions [7]. Such compounds are also found in cigarette smoke [8]. Several NAT2 SNPs have been identified that divide individuals into two distinct groups, slow and rapid acetylators [9]. NAT2*4 is the wild-type allele and is considered the main example of a rapid acetylator, while its absence together with the presence of other alleles such as rs1799930 and rs1799931 denotes a slow acetylator [10]. These variations have been discussed in light of autoimmunity, tuberculosis treatment, Parkinson's disease, and cancer [11][12][13]. Previous studies have shown a link between the NAT2 slow acetylation phenotype and bladder, lung, colon, breast, liver, and gastric cancers [14][15][16][17]. Although some studies showed an association of NAT2 with increased risk of and drug response in AML, the results are conflicting, with other studies showing that these phenotypes did not affect AML risk [18][19][20].
Isocitrate dehydrogenases (IDH1 and IDH2) are part of the tricarboxylic acid cycle and catalyze the reversible reaction between isocitrate and α-ketoglutarate. IDH mutations have been heavily studied in brain gliomas, where they are closely related to occurrence and prognosis [21]. Recently, the role of IDH mutations has also been investigated in AML, where IDH1 and IDH2 mutations have been reported with an incidence ranging from 8% to 12% [22]. These mutations have been associated with leukemogenesis: they prevent the reversible conversion of isocitrate to α-ketoglutarate in the tricarboxylic acid cycle and, instead, lead to increased production of the oncogenic 2-hydroxyglutarate from α-ketoglutarate [23]. However, there is conflicting evidence about the predictive value of IDH1/2 mutations in AML [24].
The aim of this study was, therefore, to analyze the relationship between five single nucleotide polymorphisms (SNPs), none previously studied in any Arab population, and the risk and overall survival of AML in Jordanian Arab patients. The SNPs are NAT2 (rs1799930 and rs1799931), IDH1 (rs121913500), and IDH2 (rs121913502 and rs1057519736). In addition, we wanted to explore the value of these genes in publicly curated data at the multi-omics level.

Patients and data collection
Paraffin-embedded samples from AML patients (n = 30) were retrieved from the archives of King Abdullah University Hospital for the period from January 2013 to December 2021. All cases were reviewed by SK, and one representative section was chosen from each case. Ethics approval was obtained from the ethics committee of Jordan University of Science and Technology [Institutional Review Board (IRB) code number 18/105/2017, dated 04/05/2017] in accordance with the 1964 Declaration of Helsinki and its later amendments. The requirement for formal written informed consent was waived by the IRB. Control group samples were peripheral blood; all healthy controls (n = 225) participated voluntarily and signed written informed consent. The names of cases and controls were coded, blinded, and treated confidentially. The authors had no access to information that could identify individual participants during or after data collection.

DNA extraction
Genomic DNA was extracted for the AML patients from formalin-fixed paraffin-embedded tissue using a commercially available kit, the DNeasy Blood & Tissue Kit (Qiagen Ltd., West Sussex, UK), following the manufacturer's protocols. Genomic DNA from the control subjects' blood samples was extracted using the QIAamp or Promega DNA Mini Kit according to the manufacturer's instructions. The quality of the extracted DNA was examined using agarose gel electrophoresis and ethidium bromide staining. The concentration and purity of the extracted DNA were assessed using a NanoDrop 1000 spectrophotometer. The pure DNA samples, with their concentrations, were sent to the Australian Genome Research Facility (AGRF, Melbourne Node, Melbourne, Australia) for genotyping of the five SNPs, NAT2 (rs1799930 and rs1799931), IDH1 (rs121913500), and IDH2 (rs121913502 and rs1057519736), in all subjects (patients and controls). The SNPs, their positions, and the primer sequences are shown in Table 1. Genotyping with the Sequenom MassARRAY system (iPLEX GOLD) (Sequenom, San Diego, CA, USA) was performed at the AGRF according to the manufacturer's recommendations. Genotype distributions were compared between patients and controls. Unconditional logistic regression analysis was used to estimate the association between genotype frequency and the risk of developing AML.

TCGA analysis
To further understand the impact of the included genes on AML patients, The Cancer Genome Atlas (TCGA) acute myeloid leukemia PanCancer Atlas cohort was included [25]. Mutational and RNA-seq data were obtained to investigate the prognostic value of our genes of interest in terms of expression and mutational status. Data on 200 AML patients were extracted from the TCGA project through cBioPortal, and survival analysis was conducted using the web-based computation tool UCSC Xena [26,27]. Oncoprint graphs were generated using cBioPortal to illustrate the genomic features of the included cohort. Cut-off points for mRNA expression data were generated using the X-Tile program [28].

Statistical analysis
Categorical variables were reported as the number of cases (percentage) and compared using Pearson's chi-square (χ2) test or Fisher's exact test as appropriate. Continuous variables were expressed as mean ± standard deviation (SD) if normally distributed and compared using the independent Student's t test or one-way analysis of variance (ANOVA) as appropriate. The probabilities of event-free survival (EFS) and overall survival (OS) were estimated using the Kaplan-Meier method and were compared among subsets of patients using the log-rank test. For all statistical analyses, the P values were two-sided, and a P value of <0.05 was deemed statistically significant. Statistical analyses were performed using the Statistical Package for the Social Sciences (SPSS Statistics for Windows, Version 20.0; IBM Corp., Armonk, NY, USA). Genotype and allele frequencies were estimated. Genotype frequencies were compared with the frequencies expected under Hardy-Weinberg equilibrium (HWE) using a χ2 goodness-of-fit test.
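To make the Hardy-Weinberg check concrete, the following is a minimal sketch of a χ2 goodness-of-fit test for a biallelic SNP. It is not the authors' implementation (the analyses here were run in SPSS); the function name `hwe_chi_square` and the genotype counts are hypothetical and chosen only for illustration.

```python
import math

def hwe_chi_square(n_ref_hom, n_het, n_alt_hom):
    """Chi-square goodness-of-fit test against Hardy-Weinberg proportions.

    Returns (chi2, p_value) with 1 degree of freedom
    (3 genotype classes - 1 - 1 estimated allele frequency).
    Assumes both alleles are observed, so no expected count is zero.
    """
    n = n_ref_hom + n_het + n_alt_hom
    p = (2 * n_ref_hom + n_het) / (2 * n)  # reference allele frequency
    q = 1.0 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_ref_hom, n_het, n_alt_hom)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 df, the chi-square upper tail equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical genotype counts (GG, GA, AA); NOT data from this study.
chi2, p_value = hwe_chi_square(90, 120, 15)
print(f"chi2 = {chi2:.3f}, p = {p_value:.4f}")
```

A small p value suggests that the observed genotype counts deviate from Hardy-Weinberg proportions, which in a control group is commonly taken as a prompt to check for genotyping error or population stratification.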

Patient characteristics
A total of 30 AML patients and 225 healthy controls were included in this study. Females comprised 50% (n = 15) of patients and 65.3% (n = 147) of controls.

Allele frequency
In our five-SNP panel, four SNPs showed no significant differences between cases and controls in either single-allele or genotype frequency, as demonstrated in Table 3. The NAT2 rs1799930 SNP showed a statistically significant difference in genotype frequency between cases and controls: the GG genotype was the most common among cases (69%), whereas the GA genotype was the most common among controls (47%). The frequency of the G allele was 79% in patients and 67% in controls.

Genetic models comparison
Four modes of inheritance were considered for our SNPs of interest: codominant, dominant, recessive and overdominant. For the rs1799930 SNP, a significant difference was found in the codominant model, in which the GA genotype was more frequent among controls (OR: 3.58, 95% CI: 1.38-9.29). The dominant model also showed a significant difference, with the GG genotype having higher odds among cases (OR: 2.88, 95% CI: 1.25-6.61). Lastly, the overdominant model showed a significant difference, with the GG and AA genotypes being more common in the case cohort (OR: 3.37, 95% CI: 1.32-8.60) (Table 3). Table 4 includes details on the rs1799930 modes of inheritance.
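The way the three genotypes collapse into two groups under each inheritance model can be sketched as follows. This is a hedged illustration, not the authors' pipeline: it computes unadjusted odds ratios with Woolf (log-based) 95% confidence intervals from 2x2 counts. The function names, the choice of GG as the group contrasted with the others, and all counts are assumptions made for the example.

```python
import math

def odds_ratio(case_a, ctrl_a, case_b, ctrl_b, z=1.96):
    """Unadjusted odds ratio of group A vs group B with a Woolf 95% CI.

    Assumes all four cell counts are non-zero.
    """
    or_ = (case_a * ctrl_b) / (ctrl_a * case_b)
    se = math.sqrt(1 / case_a + 1 / ctrl_a + 1 / case_b + 1 / ctrl_b)
    lower = math.exp(math.log(or_) - z * se)
    upper = math.exp(math.log(or_) + z * se)
    return or_, lower, upper

def inheritance_models(cases, controls):
    """Collapse GG/GA/AA counts into the 2x2 contrast of each model."""
    def contrast(group_a, group_b):
        return odds_ratio(
            sum(cases[g] for g in group_a), sum(controls[g] for g in group_a),
            sum(cases[g] for g in group_b), sum(controls[g] for g in group_b),
        )
    return {
        "codominant GG vs GA": contrast(["GG"], ["GA"]),
        "codominant GG vs AA": contrast(["GG"], ["AA"]),
        "dominant GG vs GA+AA": contrast(["GG"], ["GA", "AA"]),
        "recessive GG+GA vs AA": contrast(["GG", "GA"], ["AA"]),
        "overdominant GG+AA vs GA": contrast(["GG", "AA"], ["GA"]),
    }

# Hypothetical genotype counts; NOT data from this study.
cases = {"GG": 20, "GA": 8, "AA": 2}
controls = {"GG": 80, "GA": 100, "AA": 45}
for name, (or_, lower, upper) in inheritance_models(cases, controls).items():
    print(f"{name}: OR = {or_:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

For a single binary predictor with no covariates, the odds ratio from unconditional logistic regression coincides with this cross-product ratio, so the sketch mirrors an unadjusted analysis only.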

Survival analysis
Kaplan-Meier plots for OS stratified by the different modes of inheritance for rs1799930 showed different trends. For instance, although the difference was not statistically significant in the codominant stratification, patients with the AA genotype had a higher probability of survival after around 3 years, followed by those with GG and lastly by patients with the heterozygous GA genotype (median OS: AA, 33.7 months; GG, 30.8 months; GA, 9.1 months; P = 0.334) (Fig 1). Since there was only one IDH mutation (IDH2 rs121913502), survival analysis was not feasible.
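For readers unfamiliar with how such median OS values arise, the following is a minimal sketch of the Kaplan-Meier product-limit estimator. It is not the software used in this study; the function names and the follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up time for each patient
    events : 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        n_at_risk -= leaving
    return curve

def median_survival(curve):
    """First time at which the survival estimate drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached

# Hypothetical follow-up in months; NOT data from this study.
months = [2, 4, 4, 6, 8]
died = [1, 1, 0, 1, 0]
curve = kaplan_meier(months, died)
print(curve, median_survival(curve))
```

With very few patients per genotype group, each death moves the curve by a large step, which is one reason small subgroups can show widely separated medians without a significant log-rank test.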

TCGA mutational analysis
Fig 2 illustrates an Oncoprint displaying the distribution of mutations in the included genes across the TCGA AML cohort, coupled with an RNA-seq heat map showing the corresponding expression. Interestingly, IDH mutations were reported in 20% of patients, whereas NAT2 was altered in only one patient, who had an amplification. Among IDH-mutated patients, there was no significant difference in OS regardless of the IDH type (Fig 3).
The included TCGA AML patients were stratified into high- and low-expression groups based on a cut-off point determined using the X-Tile software. Neither NAT2 nor IDH1 stratification resulted in a statistically significant difference, although both numerically favored the lower-expression group (Fig 4).

Discussion
The impact of various SNPs has been widely studied in AML, which in turn has contributed to a deeper understanding of the mechanisms underlying leukemogenesis, treatment response and toxicity, leading to improved prognosis and more adequate treatment strategies. Genetic variations are important to consider in the context of interpopulation differences, and our study is the first from the region to investigate our SNPs of interest in AML patients. Among the five SNPs that we analyzed for their relationship with the risk and overall survival of AML in Jordanian patients (NAT2 rs1799930 and rs1799931, IDH1 rs121913500, and IDH2 rs121913502 and rs1057519736), NAT2 rs1799930 was the only one to show a statistically significant difference in genotype frequency between cases and controls (p = 0.023), while the other SNPs did not correlate with the risk or survival of AML in our population.
NAT2, located on chromosome 8, encodes an enzyme that acetylates various carcinogens, including heterocyclic amines and amines with a carbon-only aromatic ring. Thus, NAT2 might play a protective role against cancer development. Some NAT2 SNPs affect the acetylator phenotype significantly: rs1041983 and rs1799929 are associated with fast acetylation, and rs1799930 and rs1799931 with slow acetylation. We investigated only the latter two for their role in the survival and risk of AML. Kaplan-Meier plots for OS stratified by the different modes of inheritance for rs1799930 showed different trends. For instance, although the difference was not statistically significant in the codominant stratification, patients with the AA genotype had a higher probability of survival after around 3 years, followed by those with GG and lastly by patients with the heterozygous GA genotype (median OS: AA, 33.7 months; GG, 30.8 months; GA, 9.1 months; P = 0.334). In our study, NAT2 rs1799930 was associated with higher odds of having AML, while rs1799931 was not. A meta-analysis by Tian et al. showed that rs1799930 was a potential risk factor for AML, while rs1799931 was, surprisingly, a protective factor against different cancers [29]. In addition, these results were confirmed for rs1799931 in acute leukemia by another meta-analysis by Zhu et al. [19]. However, Jiang et al. found no increased risk of AML for either rs1799930 or rs1799931 [20]. To understand NAT2's role in AML, we further studied the multi-omics data of the TCGA-LAML cohort. NAT2 RNA-seq expression was not correlated with better survival. NAT2 mutation was rare, as only one patient carried one. This might be due to ethnic differences between our Arab patients and the American patients in the TCGA cohort. On this note, Zhu et al. noted a different risk of acute leukemia for rs1799931, but not for rs1799930, in studies that included mixed ethnic groups [19]. In addition, Caucasian patients carrying rs1801280 were found to have the highest risk of acute leukemia. However, the results were limited by the heterogeneity of the included case-control studies.
IDH1 and IDH2 mutations are among the most common mutations in AML. These mutations have been found to drive the formation of 2-hydroxyglutarate. High 2-hydroxyglutarate levels lead to a hypermethylated state, aberrant gene splicing, and impairment of hematopoietic differentiation [30]. Thus, ivosidenib (an IDH1 inhibitor) and enasidenib (an IDH2 inhibitor) were developed and FDA-approved for relapsed/refractory (R/R) AML [22]. Our ability to investigate the prognostic value of IDH mutations and to study their correlation with overall survival in AML patients was limited by our small sample size (only one patient carried the IDH2 rs121913502 mutation) and by the failure to control for confounding factors such as smoking. On the other hand, the incidence of IDH mutations in the TCGA-LAML cohort was 20%. Still, overall survival did not differ significantly based on IDH mutations. The reported prognostic role of IDH1/2 mutations is still unclear. In a meta-analysis by Wang and colleagues [24], IDH mutations were not associated with OS (hazard ratio [HR]: 1.05, 95% CI 0.89-1.23). However, when the data were stratified by IDH1 and IDH2, IDH1 was correlated with worse OS (HR: 1.17, 95% CI 1.05-1.31), while IDH2 was associated with better OS (HR: 0.78, 95% CI 0.66-0.93).
There were several limitations in our study. First, the sample size among the cases was small (n = 30). Second, other SNPs in the investigated genes were not assessed. Third, this study covered only the northern Jordanian population. Fourth, patients' exposure to possible carcinogens such as cigarette smoke was not fully assessed. Finally, our survival analysis should be interpreted with caution, as the retrospective design and small sample size might have yielded inaccurate results.

Conclusion
Our study indicates that NAT2 rs1799930 genotype frequencies differ significantly between AML patients and controls, while IDH mutations did not correlate with the risk or survival of AML in the Jordanian population. These results were similar in the TCGA-LAML cohort, with the notable exception of the rare NAT2 mutation. A larger cohort study is needed to further investigate our results.